ABSTRACT

This paper discusses the use of contrast coefficients
in multiple linear regression models, and shows how they can provide
for a logical method of analysis in both the analysis of variance and
the analysis of covariance. (CK)
The Use of Contrast Coding to
Simplify ANOVA and ANCOVA Procedures
in Multiple Linear Regression

Cohen (1968) presented a discussion of contrast coding in multiple linear
regression models for use in analysis of variance (ANOVA) and analysis of
covariance (ANCOVA). The general theme of Cohen's article was that the main
effects and interaction of ANOVA and ANCOVA can be reflected in a linear model
through the use of specifically coded predictor vectors. Other writers have
referred to these vectors as dummy vectors, nonsense coded vectors, or group
membership vectors. In our work with multiple regression, we have found
Cohen's system of contrast coding to provide a very logical and relatively
simple method for developing regression models to answer more specific questions
than the overall main effects and interaction tests generally applied in ANOVA.
One purpose of this paper is to present a discussion of the use of contrast
coding to reflect orthogonal comparisons.

We have also found that, as Cohen suggests, contrast coding can easily be
applied in ANCOVA. Further, we found that for a two-way analysis of covariance,
contrast coding leads to a more exact duplication of traditional analysis of
covariance than does the standard method of designating group membership
predictor vectors. A second purpose of this paper is to present a discussion
of the application of contrast coding to ANCOVA.

Analysis of Variance 

Consider an experiment in which two treatment conditions are to be compared. 
In this case, Winer (1962) indicates that each individual score results from a 
number of sources of variability. According to Winer, 

        X_ij = μ + τ_j + ε_ij

Where:  X_ij = an observation on person i under treatment j

        μ = grand mean of all potential observations

        τ_j = effect of treatment j

        ε_ij = error associated with X_ij

In order to answer the question of whether there is a significant difference
between Treatments 1 and 2 in a standard regression model (Bottenberg & Ward,
1963; Kelly et al., 1969), one would employ the following full model:

Model 1   Y = a_0 U + a_1 X_1 + a_2 X_2 + E_1

Where:  Y = vector of criterion scores

        U = unit vector (all elements are 1)

        X_1 = 1 if the corresponding criterion score comes from
              Treatment 1; 0 otherwise

        X_2 = 1 if the corresponding criterion score comes from
              Treatment 2; 0 otherwise

        E_1 = error vector

        a_0, a_1, a_2 = partial regression weights

It will be noted that Y corresponds to Winer's X_ij; a_0 to Winer's μ;
a_1 and a_2 to Winer's τ_j; and E_1 to Winer's ε_ij. To determine if a
difference exists between Treatments 1 and 2, Model 1 would be compared to a
restricted model (Model 99) which would contain only the unit vector as a
predictor vector and the error vector.

Model 99   Y = a_0 U + E

Using contrast coding to reflect Treatments 1 and 2, the following full 
model would result: 
Model 2   Y = a_0 U + a_1 X_1 + E_2

Where:  Y = criterion scores

        U = unit vector

        X_1 = 1 if criterion from Treatment 1 or -1 if criterion from
              Treatment 2

        E_2 = error vector

        a_0, a_1 = partial regression weights

To answer the question as to whether or not Treatments 1 and 2 are different,
Model 2 would be compared to Model 99.

The advantage of contrast coding in the above example seems to be in the
determination of degrees of freedom. It will be noted that the analysis in
this example consists of a simple t-test or an F-test with one degree of
freedom in the numerator. In order to perform this analysis with Model 1, one
must set a_1 and a_2 equal to 0. This loss of two vectors results in a loss
of only one degree of freedom because there is a linear dependency existing
within the set of vectors U, X_1, and X_2 in Model 1 (U = X_1 + X_2). In
Model 2, no linear dependencies exist in the predictor variables. As a result,
the test for a significant difference between Treatments 1 and 2 is
accomplished simply by setting a_1 = 0, and the restriction of one regression
weight accurately reflects the appropriate number of degrees of freedom for
this analysis.
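
The degrees-of-freedom argument for the two-treatment case can be checked numerically. The following Python (NumPy) sketch is an illustration only: the scores, the helper r_squared, and all variable names are hypothetical and do not come from the original analysis. It counts independent vectors by matrix rank and shows that Model 1 and Model 2 account for the same proportion of variance.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of criterion variance accounted for by the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

# Hypothetical criterion scores: four persons per treatment.
y = np.array([3.0, 5.0, 4.0, 6.0, 7.0, 9.0, 8.0, 10.0])
group = np.array([1, 1, 1, 1, 2, 2, 2, 2])

U = np.ones_like(y)
# Model 1: unit vector plus two 1/0 group membership vectors.
model_1 = np.column_stack([U, group == 1, group == 2]).astype(float)
# Model 2: unit vector plus one contrast vector coded 1 / -1.
model_2 = np.column_stack([U, np.where(group == 1, 1.0, -1.0)])
# Model 99: unit vector only.
model_99 = U[:, None]

# Independent vectors in each model, counted by matrix rank.
rank_1 = np.linalg.matrix_rank(model_1)    # three columns, but U = X_1 + X_2
rank_2 = np.linalg.matrix_rank(model_2)
rank_99 = np.linalg.matrix_rank(model_99)

r2_model_1 = r_squared(model_1, y)
r2_model_2 = r_squared(model_2, y)

# F ratio for Model 2 against Model 99 (the R^2 of Model 99 is 0).
df1 = rank_2 - rank_99
df2 = len(y) - rank_2
F = (r2_model_2 / df1) / ((1.0 - r2_model_2) / df2)
print(rank_1, rank_2, df1, df2, F)
```

In Model 2 the number of predictor vectors other than the unit vector equals the numerator degrees of freedom directly, whereas in Model 1 the rank must be worked out to obtain the same count.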

If one were to expand the above two-group example to include four treatment
conditions, the advantages of contrast coding in ANOVA become more apparent. If
Treatments 3 and 4 are added, the addition of X_3 and X_4 to Model 1 would be
required in order to allow for the main effects of Treatments 3 and 4. Model
1 would then be revised to be
Model 3   Y = a_0 U + a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4 + E_3

Where:  Y, U, X_1 and X_2 are as defined in Model 1

        X_3 = 1 if the corresponding criterion score comes from
              Treatment 3; 0 otherwise

        X_4 = 1 if the corresponding criterion score comes from
              Treatment 4; 0 otherwise

        E_3 = error vector

        a_0, a_1, a_2, a_3, a_4 = partial regression weights

To test for an overall main effect of treatments, the following restriction would
be placed on Model 3:

        a_1 = a_2 = a_3 = a_4 = 0

It can again be seen that there is one more predictor vector restricted out than
degrees of freedom lost. Further, it should be noted that within this regression
model framework, the overall treatment main effect is the only question which
can be asked and tested.

Using contrast coding, Model 4 might be used to reflect the various
treatment conditions.

Model 4   Y = a_0 U + a_1 X_1 + a_2 X_2 + a_3 X_3 + E_4

Where:  Y = criterion scores

        U = unit vector

        E_4 = error vector

        a_0, a_1, a_2, a_3 = partial regression weights

and where the elements in X_1, X_2 and X_3 reflect the linear, quadratic
and cubic trends and are as follows:

If criterion from
Treatment          X_1     X_2     X_3

    1              -3       1      -1
    2              -1      -1       3
    3               1      -1      -3
    4               3       1       1

The elements presented here are the standard coefficients for orthogonal
polynomials. The use of these values would result in X_1, X_2 and X_3 being
uncorrelated. As a result, it is possible to partition the variance into its
three independent sources.
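
The partition of variance into three independent sources can be illustrated with a short Python (NumPy) sketch. The scores below are hypothetical, and equal group sizes are assumed (the assumption under which these vectors are exactly uncorrelated); none of the variable names come from the original paper. The sketch checks that the three vectors are mutually uncorrelated and that the R^2 of Model 4 equals the sum of the squared correlations of X_1, X_2 and X_3 with the criterion.

```python
import numpy as np

# Orthogonal polynomial coefficients for four treatments (rows) and the
# linear, quadratic and cubic vectors X_1, X_2, X_3 (columns).
codes = np.array([[-3.0,  1.0, -1.0],
                  [-1.0, -1.0,  3.0],
                  [ 1.0, -1.0, -3.0],
                  [ 3.0,  1.0,  1.0]])

# Hypothetical data: three persons per treatment (equal n is assumed).
group = np.repeat([0, 1, 2, 3], 3)
y = np.array([2.0, 3.0, 4.0, 5.0, 7.0, 6.0, 9.0, 8.0, 10.0, 9.0, 11.0, 10.0])

X = codes[group]                      # one row of codes per person

# Correlations among X_1, X_2, X_3: the off-diagonal entries are zero.
corr = np.corrcoef(X, rowvar=False)

# Squared correlation of each contrast vector with the criterion.
r2_each = np.array([np.corrcoef(X[:, j], y)[0, 1] ** 2 for j in range(3)])

# R^2 of the full model (Model 4): unit vector plus the three vectors.
design = np.column_stack([np.ones_like(y), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ beta
dev = y - y.mean()
r2_full = 1.0 - (resid @ resid) / (dev @ dev)

print(np.round(corr, 10))
print(r2_each, r2_each.sum(), r2_full)
```

With unequal group sizes the same codes would generally be correlated, and the simple additive partition shown here would no longer hold exactly.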

If one were concerned about asking the overall main effect question, it
would be necessary to set a_1 = a_2 = a_3 = 0. The test of significance would
result in precisely the same outcome as the use of 1 and 0 group membership
vectors as presented in Model 3. However, it is possible to ask more specific
questions given orthogonal coefficients. One may not only be interested in
the overall main effect question. The research hypothesis in a particular
research project might be that "the average of Treatment Groups 1 and 4 is
different from the average of Treatment Groups 2 and 3" (in other words, whether
the difference follows a quadratic trend). Given Model 4, it would simply require
that a_2 be set equal to 0 in order to answer this very specific question.
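
Both claims in the preceding paragraph can be sketched in Python (NumPy): the overall test gives the same F whether Model 3 or Model 4 is the full model, and the quadratic question is a one-degree-of-freedom test obtained by dropping X_2. The data, the helper functions r_squared and f_ratio, and the variable names are hypothetical and are offered only as an illustration under the assumption of equal group sizes.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of criterion variance accounted for by the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

def f_ratio(full, restricted, y):
    """F ratio and degrees of freedom for a full versus restricted model."""
    rank_full = np.linalg.matrix_rank(full)
    df1 = rank_full - np.linalg.matrix_rank(restricted)
    df2 = len(y) - rank_full
    r2_f, r2_r = r_squared(full, y), r_squared(restricted, y)
    return ((r2_f - r2_r) / df1) / ((1.0 - r2_f) / df2), df1, df2

# Hypothetical data: three persons per treatment.
group = np.repeat([0, 1, 2, 3], 3)
y = np.array([2.0, 3.0, 4.0, 5.0, 7.0, 6.0, 9.0, 8.0, 10.0, 9.0, 11.0, 10.0])

U = np.ones_like(y)
linear = np.array([-3.0, -1.0, 1.0, 3.0])[group]      # X_1
quadratic = np.array([1.0, -1.0, -1.0, 1.0])[group]   # X_2
cubic = np.array([-1.0, 3.0, -3.0, 1.0])[group]       # X_3

model_3 = np.column_stack([U, np.eye(4)[group]])      # 1/0 membership vectors
model_4 = np.column_stack([U, linear, quadratic, cubic])
model_99 = U[:, None]

# Overall main effect: both codings lead to the same F with 3 and 8 df.
F_membership, df1_m, df2_m = f_ratio(model_3, model_99, y)
F_contrast, df1_c, df2_c = f_ratio(model_4, model_99, y)

# Quadratic question: restrict Model 4 by setting a_2 = 0 (drop X_2).
F_quadratic, df1_q, df2_q = f_ratio(
    model_4, np.column_stack([U, linear, cubic]), y)

print(F_membership, F_contrast, (df1_c, df2_c))
print(F_quadratic, (df1_q, df2_q))
```

Note that for Model 3 the numerator degrees of freedom must be obtained from the rank of the design matrix, while for Model 4 they equal the number of weights set to zero.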

As indicated above, the values in the vectors are standard coefficients for
orthogonal polynomials. It may be that such coefficients do not reflect a
particular question of interest. One might want to ask the question as to
whether the effect of Treatment 1 equals the average effect of Treatments 2, 3,
and 4. Since the standard coefficients for orthogonal polynomials do not
reflect this particular question, it would be necessary to establish a
different set of coding coefficients. Since the question as to whether
Treatment 1 equals the average of Treatments 2, 3 and 4 would require coding
coefficients in the predictor vectors to reflect the differential weighting
of the Treatments, an appropriate set of coding coefficients might be:

Treatment 1 = 3; Treatment 2 = -1; Treatment 3 = -1; Treatment 4 = -1. The
values in vectors X_1, X_2, and X_3 of Model 4 might then be as follows:

If criterion from
Treatment          X_1     X_2     X_3

    1               3       0       0
    2              -1       2       0
    3              -1      -1       1
    4              -1      -1      -1

In order to answer this question of interest, it would only be necessary to
restrict out vector X_1 by setting a_1 = 0.

The two examples presented above seem to point to two advantages which
accrue from the use of contrast coding in a one-way analysis of variance. First,
since the predictor vectors are all independent, the number of predictor
variables in a model accurately reflects the degrees of freedom for the analysis.
As was pointed out above, this is not the case when standard 1 and 0 group
membership vectors are used. Second, the use of contrast coding allows one
to ask more specific questions of interest than the overall main effect.
The importance of these two factors becomes even more apparent when one considers
a two-way analysis of variance.

Consider an experiment in which a 2 x 3 factorial design is to be applied

and assume that there are two levels of condition A and three levels of condition 
B. Winer indicates that the following linear model would account for all 
sources of variability contributing to an individual score: 

        X_ijk = μ + α_i + β_j + αβ_ij + ε_ijk

Where:  X_ijk = an observation on person k under treatment i and
                treatment j

        μ = grand mean of all potential observations

        α_i = main effect for condition A

        β_j = main effect for condition B

        αβ_ij = effect of interaction of conditions A and B

        ε_ijk = error associated with X_ijk

The various sources of variability in Winer's model can be duplicated in
standard multiple regression analysis. However, the need for a test of
interaction requires a full model which allows the differences between cell means
to vary and a restricted model which would force the differences between cell
means to be equal. While this is not a particularly difficult task, it does
require some rather lengthy algebraic manipulations of the partial regression
weights. Kelly, Beggs, McNeil, Eichelberger and Lyon (1969) include an excellent
presentation of the procedures for performing a two-way analysis of variance in
standard regression analysis, so we will not attempt to duplicate it here.

Using contrast coding to duplicate the 2x3 analysis of variance would 
require the following full model: 

Model 5   Y = a_0 U + a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4 + a_5 X_5 + E_5

Where:  Y = criterion vector

        U = unit vector

        X_1 = 1 if subject from A_1 or -1 if subject from A_2

        X_2 = -1 if subject from B_1; 0 if subject from B_2; or 1 if
              subject from B_3

        X_3 = 1 if subject from B_1; -2 if subject from B_2; or 1 if
              subject from B_3

        X_4 = X_1 multiplied by X_2

        X_5 = X_1 multiplied by X_3

        E_5 = error vector

        a_0 through a_5 = partial regression weights

It will be noted that the elements of X_2 and X_3 reflect the linear and quadratic
trends for the B main effect. In addition, the coefficients in X_4 and X_5 would
reflect the linear and quadratic components of the interaction effect. It
should also be noted that all five vectors are independent so that the number
of predictor vectors accurately reflects the total degrees of freedom for this
two-way analysis of variance.

The overall main effects and interaction tests can be simply done once
Model 5 has been established. In order to test for interaction, one need only
set a_4 = a_5 = 0 and compare the R^2 of Model 5 to the R^2 of the resulting
restricted model. In order to test for a significant A main effect, one need only
restrict X_1 from Model 5 by setting a_1 = 0. To test for the B main effect, X_2
and X_3 must be restricted from Model 5 by setting a_2 = a_3 = 0. Each of these
tests of significance can be shown to exactly duplicate the results one would
obtain through the use of traditional two-way analysis of variance equations.
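
The comparison of two R^2 values described above is conventionally carried out with the general F ratio for a full and a restricted linear model. The paper does not print this ratio, so the form below is supplied as the standard textbook expression rather than as the authors' own notation; the symbols p_f, p_r and N are introduced here for illustration.

```latex
F = \frac{\left(R_f^2 - R_r^2\right) / \left(p_f - p_r\right)}
         {\left(1 - R_f^2\right) / \left(N - p_f\right)}
```

Here R_f^2 and R_r^2 are the squared multiple correlations of the full and restricted models, p_f and p_r are the numbers of linearly independent predictor vectors in each model (the unit vector included), and N is the number of observations. For the interaction test in Model 5, p_f = 6 and p_r = 4, so the numerator has two degrees of freedom, which is simply the number of weights set to zero.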

As was the case with a one-way analysis of variance, the use of contrast
coefficients allows one to ask questions of interest other than the overall
main effects and interaction. In the example above, suppose one were interested
in determining if the interaction contained a significant quadratic trend.
This variable of interest is reflected in X_5 of Model 5. In order to test for
a significant quadratic interaction trend, one need only set a_5 = 0. The linear
trend of the interaction could be tested by setting a_4 = 0. Further, Model 5
allows one to test for significant linear and quadratic components of the
B main effect by setting a_2 = 0 and a_3 = 0 respectively. The use of contrast
coefficients in this linear regression analysis would allow one to examine any
one or all of the five independent sources of variance which the total degrees
of freedom indicate contribute to each individual criterion score. In addition,
one could ask other questions of interest by establishing a set of contrast
codes which would allow the specific question of interest to be reflected in
the predictor vectors.
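
The construction of Model 5 and the interaction test can be sketched in Python (NumPy) as follows. The scores are hypothetical, equal cell sizes are assumed, and the helper r_squared and all variable names are illustrative rather than taken from the original paper. The interaction vectors X_4 and X_5 are formed as element-by-element products, exactly as in the definitions of Model 5.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of criterion variance accounted for by the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

# Hypothetical balanced data: two persons in each of the 2 x 3 cells.
a = np.repeat([0, 0, 0, 1, 1, 1], 2)     # 0 -> A_1, 1 -> A_2
b = np.repeat([0, 1, 2, 0, 1, 2], 2)     # 0 -> B_1, 1 -> B_2, 2 -> B_3
y = np.array([4.0, 6.0, 7.0, 9.0, 12.0, 14.0, 5.0, 7.0, 6.0, 8.0, 7.0, 9.0])

U = np.ones_like(y)
X1 = np.array([1.0, -1.0])[a]            # A main effect
X2 = np.array([-1.0, 0.0, 1.0])[b]       # linear trend of B
X3 = np.array([1.0, -2.0, 1.0])[b]       # quadratic trend of B
X4 = X1 * X2                             # linear component of interaction
X5 = X1 * X3                             # quadratic component of interaction

model_5 = np.column_stack([U, X1, X2, X3, X4, X5])
no_interaction = np.column_stack([U, X1, X2, X3])   # a_4 = a_5 = 0

# With equal cell sizes all six vectors are mutually orthogonal, so the
# matrix of cross-products is diagonal.
cross_products = model_5.T @ model_5

r2_full = r_squared(model_5, y)
r2_restricted = r_squared(no_interaction, y)
df1 = 2                                  # two weights set to zero
df2 = len(y) - model_5.shape[1]          # 12 observations - 6 vectors
F_interaction = ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)
print(F_interaction, (df1, df2))
```

Dropping only X_5 (or only X_4) from model_5 in the same way would give the one-degree-of-freedom test of the quadratic (or linear) component of the interaction.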
Analysis of Covariance

The application of contrast coefficients to analysis of
covariance is a natural extension of the analysis of variance. The covariate
or concomitant variable is entered as a predictor along with the treatment
variables in the linear equation. For example, if a covariate were included in
Model 4 above, the equation would become:

Model 6   Y = a_0 U + a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4 + E_6

Where:  Y = criterion scores

        U = unit vector

        X_1 through X_3 are treatment variables of interest

        X_4 = the covariate or concomitant variable

        E_6 = error vector

        a_0 through a_4 = partial regression weights

The nature of the equation changes slightly, however, in that the predictor
variables (X_1 through X_4) are not all orthogonal to one another. Specifically,
there is a real or sample covariance between the covariate (X_4) and each of the
variables of interest (X_1 through X_3). When the restriction a_1 = a_2 = a_3 = 0
is placed on the equation, eliminating the treatment source of variance, the
weight associated with the covariate (a_4) will change in value. It can be shown
that the variance which is lost by such a restriction is that variance which is
associated with the treatment but which is independent of the covariate. In
other words, the restriction results in a loss of that variance which is unique
to the treatment variables (X_1 through X_3). Such analysis is identical to
analysis of covariance as described in such textbooks as Winer (1962), Lindquist
(1953) and McNemar (1969). The interpretation made for a significant statistical
test for treatment effect obtained by the analysis is that the treatments have
an effect on the mean criterion scores over and above that which is accounted for
by the covariate. The usual procedure of using group membership vectors in the
linear model also duplicates the analysis of covariance for one-way ANCOVA designs.
In fact, the only advantages for using contrast coefficients rather than group
membership vectors seem to be that (1) contrast coefficients provide a more direct
count of independent vectors to obtain degrees of freedom and (2) contrast
coefficients allow for tests of more specific questions concerning treatment
effects than does the use of group membership vectors.
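
The behavior of the covariate weight under restriction, and the equivalence of the two codings in the one-way case, can be illustrated with a small Python (NumPy) sketch. The data are hypothetical and were constructed so that the pooled within-group slope is exactly 0.5; the helper fit and all variable names are illustrative and not part of the original paper.

```python
import numpy as np

def fit(X, y):
    """Return least-squares weights and R^2 for criterion y on columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return beta, 1.0 - (resid @ resid) / (dev @ dev)

# Hypothetical data: four treatments, three persons each, one covariate.
group = np.repeat([0, 1, 2, 3], 3)
covariate = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0,
                      3.0, 4.0, 5.0, 4.0, 5.0, 6.0])
effect = np.array([0.0, 2.0, 3.0, 5.0])[group]
noise = np.tile([0.5, -1.0, 0.5], 4)
y = 0.5 * covariate + effect + noise

U = np.ones_like(y)
codes = np.array([[-3.0,  1.0, -1.0],
                  [-1.0, -1.0,  3.0],
                  [ 1.0, -1.0, -3.0],
                  [ 3.0,  1.0,  1.0]])
contrast = codes[group]                    # X_1, X_2, X_3
membership = np.eye(4)[group]              # 1/0 group membership vectors

model_6 = np.column_stack([U, contrast, covariate])
model_6_membership = np.column_stack([U, membership, covariate])
restricted = np.column_stack([U, covariate])     # a_1 = a_2 = a_3 = 0

beta_full, r2_full = fit(model_6, y)
_, r2_membership = fit(model_6_membership, y)
beta_restricted, r2_restricted = fit(restricted, y)

# The covariate weight a_4 changes when the treatment vectors are removed.
a4_full, a4_restricted = beta_full[-1], beta_restricted[-1]

df1 = 3
df2 = len(y) - np.linalg.matrix_rank(model_6)
F_treatment = ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)
print(a4_full, a4_restricted, F_treatment, (df1, df2))
```

Because the contrast vectors and the membership vectors span the same space, the two full models give the same R^2, and therefore the same test of the treatment effect over and above the covariate.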

Winer (1962) indicates that the linear model for a two-factor ANCOVA
would be as follows:

        X_ijk = μ + α_i + β_j + αβ_ij + γ_vk + ε_ijk

Where:  X_ijk = an observation on person k under treatment i in
                condition j given information on the covariate

        μ = grand mean of all observations

        α_i = effect of the ith treatment

        β_j = effect of the jth treatment

        αβ_ij = effect due to interaction

        γ_vk = regression effect on the covariate

        ε_ijk = error associated with X_ijk

Suppose, now, that we wish to utilize a model where the A effect contains
two different conditions and the B effect consists of a control group (B_1)
and two experimental groups (B_2 and B_3). Then the more traditional regression
model for these effects with the covariate and interaction included would be:

Model 7   Y = a_0 U + a_7 A_1B_1 + a_8 A_1B_2 + a_9 A_1B_3 + a_10 A_2B_1
              + a_11 A_2B_2 + a_6 X_0 + E_7

Where:  Y = vector of criterion scores

        U = unit vector (all elements are 1)

        A_iB_j = 1 if observation is found in both A_i and B_j; 0 otherwise

        X_0 = concomitant variable

        E_7 = error vector

        a_0 and a_6 through a_11 = partial regression weights
In order to test the interaction effect, one would restrict Model 7 to:

Model 8   Y = a_0 U + a_1 A_1 + a_2 A_2 + a_3 B_1 + a_4 B_2 + a_5 B_3
              + a_6 X_0 + E_8

Where:  Y = vector of criterion scores

        U = unit vector

        A_1 = 1 if criterion from A_1; 0 otherwise

        A_2 = 1 if criterion from A_2; 0 otherwise

        B_1 = 1 if criterion from B_1; 0 otherwise

        B_2 = 1 if criterion from B_2; 0 otherwise

        B_3 = 1 if criterion from B_3; 0 otherwise

        X_0 = concomitant variable

        E_8 = error vector

        a_0 through a_6 = partial regression weights

Then the test for interaction (R_7^2 - R_8^2) would be a test of whether the
proportion of variance unique to interaction is significant.

There is some disagreement among researchers as to procedures for testing
main effects following a non-significant test for interaction. Both Ferguson
(1971) and Winer (1962) suggest that after finding a non-significant interaction
effect, one has the option of treating the interaction sums of squares as error.
The sums of squares for interaction could, along with the appropriate degrees of
freedom, be pooled with the sums of squares error to form a more stable error
estimate. In another paper presented at this convention, Pohlmann (1972) discusses
the limit to which such pooling may aid in guarding against a Type II error.

Kelly et al. (1969) encourage the practice of pooling as discussed in the
previous paragraph. Assuming one has chosen to pool, then the A effect could
be tested by restricting a_1 and a_2 from Model 8 equal to 0 and the subsequent
model becomes:

Model 9   Y = a_0 U + a_3 B_1 + a_4 B_2 + a_5 B_3 + a_6 X_0 + E_9

Where:  Y = vector of criterion scores

        U = unit vector

        E_9 = error vector

        B_1, B_2 and B_3 = defined as in Model 8

        X_0 = concomitant variable

        a_0, a_3, a_4, a_5, a_6 = partial regression weights

(R_8^2 - R_9^2) would seem to be equal to the A main effect whereas 1 - R_8^2
would consist of a pooled error term which includes the interaction effect. The
B main effect would be tested in a manner similar to the test for the A effect.

This, however, does not duplicate the main effect that is found in
traditional two-factor ANCOVA as described in Winer (1962). In the model:

        X_ijk = μ + α_i + β_j + αβ_ij + γ_vk + ε_ijk

α_i, β_j, and αβ_ij are orthogonal to one another and, hence, the presence or
absence of any one should not have any effect on the others. However, this is
not the case in the presence of the covariate. The covariance patterns between
α_i, β_j, and αβ_ij with the covariate γ seem to be of such a nature that the
restriction of any of the three effects equal to 0 results in a change (increase
or decrease) in the other remaining effects. This would not be the case without
the presence of the covariate, nor does it affect a one-way ANCOVA. Thus, when
the interaction term is pooled with the error in order to test a main effect,
the amount of variance associated with that main effect is different from what
it would have been without pooling.
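
The effect of pooling on the variance attributed to the A main effect can be explored with a Python (NumPy) sketch. The data are hypothetical and balanced, and the helper functions are illustrative. As a simplifying assumption, contrast-coded vectors are used for A and B throughout; together with the unit vector they span the same space as the group membership vectors of Model 8, so the R^2 values are unaffected by this choice.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of criterion variance accounted for by the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

# Hypothetical balanced data: two persons in each of the 2 x 3 cells.
a = np.repeat([0, 0, 0, 1, 1, 1], 2)
b = np.repeat([0, 1, 2, 0, 1, 2], 2)
x0 = np.array([10.0, 12.0, 11.0, 14.0, 13.0, 15.0,
               9.0, 12.0, 12.0, 13.0, 14.0, 17.0])     # covariate
y = np.array([20.0, 23.0, 25.0, 27.0, 26.0, 31.0,
              18.0, 22.0, 21.0, 25.0, 24.0, 26.0])     # criterion

U = np.ones_like(y)
A = np.array([1.0, -1.0])[a]
B1 = np.array([2.0, -1.0, -1.0])[b]
B2 = np.array([0.0, 1.0, -1.0])[b]
AB1, AB2 = A * B1, A * B2

def delta_r2_for_A(with_interaction, with_covariate):
    """Loss in R^2 when the A vector is restricted out of the model."""
    others = [B1, B2]
    if with_interaction:
        others += [AB1, AB2]     # interaction kept in the model (not pooled)
    if with_covariate:
        others += [x0]
    full = np.column_stack([U, A] + others)
    restricted = np.column_stack([U] + others)
    return r_squared(full, y) - r_squared(restricted, y)

# Without the covariate, pooling does not change the variance due to A.
anova_pooled = delta_r2_for_A(False, False)
anova_unpooled = delta_r2_for_A(True, False)

# With the covariate, the two values will in general differ.
ancova_pooled = delta_r2_for_A(False, True)
ancova_unpooled = delta_r2_for_A(True, True)

print(anova_pooled, anova_unpooled)
print(ancova_pooled, ancova_unpooled)
```

The first pair of values agrees because, in a balanced design, the A vector is orthogonal to the B and interaction vectors; once the covariate is entered, that orthogonality no longer guarantees agreement.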

The use of contrast coding in two-factor ANCOVA would provide a method of
analysis where one could very easily test the main effects without pooling the
interaction, thus yielding a duplicate result to the traditional two-factor ANCOVA
as discussed by Winer (1962). Furthermore, contrast coefficients allow for tests
of more specific questions of interest.

Given the example presented above, where the A effect consists of two
conditions and the B effect consists of one control group (B_1) and two experimental
groups (B_2 and B_3), the experimenter might be interested in a comparison of the
experimental groups of the B condition to the control group (B_1) as the first
question of interest. A second question might be if there is a difference
between the experimental groups over both A conditions. Then, one may be
interested in whether either or both of the B experimental effects are different
within the two A conditions.

Note that in all questions, the interest lies in the effect of treatment 
over and above that of the concomitant variable. 

The model appropriate for this ANCOVA would be as follows: 

Model 10   Y = a_0 U + a_1 A + a_2 B_1 + a_3 B_2 + a_4 AB_1 + a_5 AB_2
               + a_6 X_0 + E_10

Where:  Y = criterion vector

        U = unit vector

        A = 1 if in condition A_1; -1 if in condition A_2

        B_1 = 2 if in control group; -1 if in either experimental group

        B_2 = 0 if in control group; 1 if in experimental group 1 (B_2);
              -1 if in experimental group 2 (B_3)

        AB_1 = (obtained by A x B_1)
               A_1B_1 = 2;  A_1B_2 = -1;  A_1B_3 = -1;
               A_2B_1 = -2;  A_2B_2 = 1;  A_2B_3 = 1

        AB_2 = (obtained by A x B_2)
               A_1B_1 = 0;  A_1B_2 = 1;  A_1B_3 = -1;
               A_2B_1 = 0;  A_2B_2 = -1;  A_2B_3 = 1

        X_0 = concomitant variable

        E_10 = error vector

        a_0 through a_6 = the regression weights associated with the
                          respective vectors

There are three apparent advantages of contrast coding over the more standard
use of group membership vectors. First, the number of parameter estimates is
directly reflected by the number of weights (a_0 through a_6) used in the model and,
hence, leads to a more direct count of degrees of freedom. Secondly, one can go
directly to tests of the questions of interest by restricting the appropriate
weight of Model 10. For the four questions of interest specified above this
would result in four restricted models by first setting a_2 = 0, followed by
a_3 = 0, a_4 = 0, and finally a_5 = 0. Each of the resulting restricted models
would be compared to Model 10 above. One could still test for an overall interaction
effect or for either of the main effects (A or B) by simply restricting all
weights for the appropriate vectors equal to 0. The third advantage of contrast
coding is that it allows for tests of the main effect without pooling the error
term, thus precisely duplicating the two-factor ANCOVA as presented in Winer
(1962). While such a traditional analysis may not be superior, the authors
suspect that the difference in the two analyses would lead to somewhat different
conclusions. That is, the traditional ANCOVA and the use of contrast coefficients
analyze variance which is independent of all other sources in the model, whereas
the use of standard group membership vectors yields variance components that are
in some way common to the interaction.
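
The four single-degree-of-freedom tests based on Model 10 can be sketched in Python (NumPy). The data are hypothetical, and the helper functions and variable names are illustrative rather than the authors' own. The sketch also fits a cell-membership version of the full model (in the spirit of Model 7) to show that both full models account for the same proportion of variance.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of criterion variance accounted for by the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

# Hypothetical balanced data: two persons in each of the 2 x 3 cells.
a = np.repeat([0, 0, 0, 1, 1, 1], 2)       # 0 -> A_1, 1 -> A_2
b = np.repeat([0, 1, 2, 0, 1, 2], 2)       # 0 -> control, 1 and 2 -> experimental
x0 = np.array([10.0, 12.0, 11.0, 14.0, 13.0, 15.0,
               9.0, 12.0, 12.0, 13.0, 14.0, 17.0])     # covariate
y = np.array([20.0, 23.0, 25.0, 27.0, 26.0, 31.0,
              18.0, 22.0, 21.0, 25.0, 24.0, 26.0])     # criterion

U = np.ones_like(y)
A = np.array([1.0, -1.0])[a]
B1 = np.array([2.0, -1.0, -1.0])[b]        # control versus experimental groups
B2 = np.array([0.0, 1.0, -1.0])[b]         # experimental 1 versus experimental 2
vectors = {"A": A, "B1": B1, "B2": B2, "AB1": A * B1, "AB2": A * B2}

model_10 = np.column_stack([U] + list(vectors.values()) + [x0])
r2_full = r_squared(model_10, y)
df2 = len(y) - model_10.shape[1]           # 12 observations - 7 vectors

def single_df_F(name):
    """F (1 and df2 degrees of freedom) for setting one weight to zero."""
    keep = [v for key, v in vectors.items() if key != name]
    restricted = np.column_stack([U] + keep + [x0])
    return (r2_full - r_squared(restricted, y)) / ((1.0 - r2_full) / df2)

# The four questions of interest: a_2 = 0, a_3 = 0, a_4 = 0, a_5 = 0.
F_tests = {name: single_df_F(name) for name in ("B1", "B2", "AB1", "AB2")}

# Cell-membership full model: five 1/0 cell vectors plus the covariate.
cell = a * 3 + b
model_7 = np.column_stack(
    [U] + [(cell == c).astype(float) for c in range(5)] + [x0])
r2_cells = r_squared(model_7, y)

print(F_tests)
print(r2_full, r2_cells)
```

Each restricted model differs from Model 10 by exactly one vector, so the numerator degrees of freedom can be read directly from the restriction.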

Summary 

In this paper, we have shown how the use of contrast coefficients in multiple
linear regression models can provide for a logical method of analysis in ANOVA
and ANCOVA. Three distinct advantages were indicated. First, the number of
estimated parameters is directly indicated in the model, thus leading to a more
natural and direct count for degrees of freedom. Second, contrast coding allows
for the testing of specific variables of interest other than the overall main
effect and overall interaction effects. Finally, in the case of two-way ANCOVA,
contrast coding does not require pooling interaction with the error term and
thus is an exact duplicate of ANCOVA as presented in Winer (1962).

It would seem that the use of contrast coefficients allows for a variety
of types of analysis within the general linear model. This would present future
researchers with a more integrated concept of data analysis rather than
contribute to fragmentation of the field by discussing regression as separate
from ANOVA with all its various subcategories. The use of contrast coefficients
encourages researchers to ask specific questions which can be analyzed with
F-tests which have only one degree of freedom in the numerator. When there is
only one degree of freedom in the numerator, the researcher is in effect dealing
with a single source of variability and, as a result, is better able to interpret
the meaning of the test of significance. In overall main effects or interaction
tests, the F-ratio generally has more than one degree of freedom in the numerator.
The researcher must then attempt to interpret the test of significance realizing
that he is analyzing several sources of variability simultaneously.